[improve](streaming-job) async chunk splitting for StreamingInsertJob #63079
JNSimba wants to merge 13 commits
Conversation
Thank you for your contribution to Apache Doris. Please clearly describe your PR:
Pull request overview
This PR moves StreamingInsertJob (CDC FROM-TO and cdc_stream TVF) snapshot chunk splitting from a synchronous CREATE STREAMING JOB path to an incremental, scheduler-tick-driven flow. The goal is to avoid long blocking CREATE times and BRPC timeouts on large / skewed PK tables by fetching snapshot splits in small batches and persisting progress for recovery.
Changes:
- Adds split-progress APIs to `SourceOffsetProvider` and implements an async split state machine in `JdbcSourceOffsetProvider` (plus new FE tests).
- Introduces `FetchTableSplitsRequest` fields to drive stateless, resumable split generation (`nextSplitStart`/`nextSplitId`/`batchSize`) and rebuilds cdc_client split fetching around the flink-cdc `ChunkSplitter`.
- Persists per-table chunk lists incrementally via `StreamingJobUtils.upsertChunkList`, and advances splits each scheduler tick (including a pre-advance in PENDING).
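The incremental split state machine described above can be sketched as follows. This is a simplified illustration of the idea behind the PR's `SplitProgress`/`advanceSplits`, not the actual Doris classes; the field names and chunking logic here are assumptions.

```java
import java.util.ArrayList;
import java.util.List;

// Simplified sketch of a resumable split-progress state machine.
// Assumption: this mirrors the idea behind the PR's SplitProgress /
// advanceSplits hooks, not the actual Doris implementation.
public class SplitProgressSketch {
    public String currentSplittingTable; // table currently being chunked; null when idle
    public int nextSplitId;              // resume cursor: id of the next chunk to cut

    // One scheduler tick: cut at most batchSize chunks and remember where to resume.
    public List<String> advanceOneBatch(String table, int batchSize, int totalChunks) {
        List<String> batch = new ArrayList<>();
        if (currentSplittingTable == null) {
            currentSplittingTable = table; // start splitting this table
            nextSplitId = 0;
        }
        while (batch.size() < batchSize && nextSplitId < totalChunks) {
            batch.add(table + "-split-" + nextSplitId);
            nextSplitId++;
        }
        if (nextSplitId >= totalChunks) {
            currentSplittingTable = null; // table fully cut; idle until the next table
        }
        return batch;
    }

    public boolean noMoreSplits(int totalChunks) {
        return currentSplittingTable == null && nextSplitId >= totalChunks;
    }
}
```

Because each tick only cuts one bounded batch and persists the cursor, a restart can resume mid-table instead of re-splitting from scratch.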
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 7 comments.
Show a summary per file
| File | Description |
|---|---|
| fs_brokers/cdc_client/src/main/java/org/apache/doris/cdcclient/source/reader/JdbcIncrementalSourceReader.java | Reworks /api/fetchSplits handling to drive flink-cdc ChunkSplitter directly (stateless batch split generation). |
| fe/fe-core/src/test/java/org/apache/doris/job/offset/jdbc/SplitProgressTest.java | Unit tests for SplitProgress default state and deep-copy semantics. |
| fe/fe-core/src/test/java/org/apache/doris/job/offset/jdbc/JdbcSourceOffsetProviderAsyncSplitTest.java | Unit tests covering async split advancement, dedup, noMoreSplits, and committed-progress advancement. |
| fe/fe-core/src/main/java/org/apache/doris/job/util/StreamingJobUtils.java | Adds per-table chunk_list UPSERT support with id reuse / allocation. |
| fe/fe-core/src/main/java/org/apache/doris/job/offset/SourceOffsetProvider.java | Adds default split-progress hooks (initSplitProgress, advanceSplits, noMoreSplits). |
| fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcTvfSourceOffsetProvider.java | Removes create-time pre-splitting; re-init split progress on replay; relies on scheduler-driven split fetching. |
| fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcSourceOffsetProvider.java | Implements async split progress, scheduler-driven split fetching, persistence to system table, and restart replay logic. |
| fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingJobSchedulerTask.java | Calls advanceSplitsIfNeed() each tick and pre-advances once in PENDING before dispatch. |
| fe/fe-core/src/main/java/org/apache/doris/job/extensions/insert/streaming/StreamingInsertJob.java | Persists syncTables, initializes split progress on CREATE, and adds advanceSplitsIfNeed() that pauses job on failure. |
| fe/fe-common/src/main/java/org/apache/doris/job/cdc/request/FetchTableSplitsRequest.java | Adds nextSplitStart, nextSplitId, and batchSize fields to support resumable batched split fetching. |
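Based on the file summary above, a minimal sketch of the resumable request contract might look like this. The field names follow the PR description (`nextSplitStart`, `nextSplitId`, `batchSize`); the types and the helper method are assumptions for illustration only.

```java
// Hypothetical sketch of a resumable split-fetch request, loosely modeled on
// the fields the PR adds to FetchTableSplitsRequest. Not the real class.
public class FetchTableSplitsRequestSketch {
    public String tableId;          // table whose chunks are being cut
    public Object[] nextSplitStart; // chunk-key cursor; null means "start from the beginning"
    public int nextSplitId;         // id to assign to the next generated chunk
    public int batchSize;           // max chunks to return in one RPC

    // A first-batch request carries no cursor, so cdc_client can stay stateless:
    // everything needed to resume lives in the request itself.
    public boolean isFirstBatch() {
        return nextSplitStart == null && nextSplitId == 0;
    }
}
```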
Comments suppressed due to low confidence (1)
fe/fe-core/src/main/java/org/apache/doris/job/offset/jdbc/JdbcTvfSourceOffsetProvider.java:320
- replayIfNeed() comment still says snapshot splits in the meta table are "written by initOnCreate", but initOnCreate() is now an intentional no-op and meta writes come from scheduler-driven advanceSplits()/upsertChunkList. Updating this comment will avoid confusion when debugging TVF recovery behavior.
// Re-init transient split progress fields lost across FE restart.
// syncTables itself is persisted on StreamingInsertJob; cdcSplitProgress is rebuilt empty
// here and advanceSplits will resume from the system table on next tick.
if (cdcSplitProgress == null) {
initSplitProgress(job.getSyncTables());
}
if (currentOffset == null) {
// No committed txn yet. If snapshot splits exist in the meta table (written by
// initOnCreate), restore remainingSplits so getNextOffset() returns snapshot splits
// instead of a BinlogSplit (which would incorrectly skip the snapshot phase).
/review
I found one additional blocking issue beyond the existing review threads.
Critical checkpoint conclusions:
- Goal/test coverage: the PR moves CDC chunk splitting from CREATE-time blocking to scheduler-driven async fetching and adds focused unit tests, but the TVF FE-restart path is not covered and can stop fetching the remaining chunks.
- Scope: the change is focused on async split progress, though it touches both FROM-TO and TVF paths.
- Concurrency: existing threads already cover the broad splitsLock/RPC concern; I did not add a duplicate. The new issue is lifecycle/replay state, not a new lock-order finding.
- Lifecycle/replay: blocking issue found for cdc_stream TVF restart because transient cachedSyncTables is not restored.
- Configuration/compatibility: no new config or wire-incompatible persistent format issue found beyond the added optional request fields.
- Parallel paths: FROM-TO replay initializes split progress, but the TVF override does not do the equivalent unconditionally.
- Testing: unit tests cover the base provider state machine, but miss TVF replay after FE restart. No end-to-end regression result is included in this PR.
- Observability/performance/transactions: no additional distinct issue found beyond existing review threads.
- User focus: no additional user-provided review focus was supplied.
run buildall
/review
Reviewed the full PR diff and existing review threads. I found two additional correctness issues that are not duplicates of the existing comments.
Critical checkpoint conclusions:
- Goal/test: The PR moves CDC snapshot split generation to scheduler-driven batches and adds tests for the base async state machine, but the cdc_stream TVF path still has an untested concurrent update path and the split-key type cache lacks collision coverage.
- Scope/focus: The implementation is mostly focused, but it now routes both non-TVF and TVF CDC through shared async splitting state, which exposes subclass state-management gaps.
- Concurrency: Blocking issue found. `JdbcTvfSourceOffsetProvider.updateOffset()` mutates inherited split lists/maps without `splitsLock` while scheduler-driven `advanceSplits()` and inherited readers use that lock.
- Lifecycle/restart: Existing review threads already cover TVF restart/cache restoration concerns; I did not duplicate them.
- Config: New CDC RPC/batch configs are present with descriptions and are read dynamically.
- Compatibility/protocol: FE now sends `batchSize` and resume fields; existing threads already cover composite-boundary contract concerns.
- Parallel paths: MySQL and generic JDBC split fetching were both updated; the cache-key collision issue affects the generic JDBC path and should be considered for MySQL if chunk key/database identity can vary.
- Tests: Added tests cover base provider state, but no TVF concurrency/restart continuation test or cache-collision test covers the issues below.
- Observability: Logs were added around split fetch and progress; no additional blocker found.
- Persistence/transactions: Snapshot progress is persisted through the meta table/EditLog or txn attachment depending on path; existing unresolved threads cover restart truncation risks.
- Performance: Existing review threads already cover long lock holds and repeated sorting; I did not duplicate them.
User focus: no additional user-provided review focus was supplied.
run buildall
FE UT Coverage Report: Increment line coverage
… fix unevenly-cut resume
/review
Automated review result: request changes. I did not add new inline comments because the blocking findings I confirmed are already covered by existing review threads and should not be duplicated.
Critical checkpoint conclusions:
- Goal/test proof: The PR aims to move CDC snapshot split generation to scheduler-driven batches and add resume/restart coverage. Tests were added, but current code still has unresolved correctness and concurrency concerns already raised in inline threads, including snapshot truncation/races and split-fetch batching behavior.
- Scope/focus: The feature is focused, but the change is broad across FE scheduling, persisted split metadata, CDC client split generation, and TVF recovery; several edge paths still need tightening before merge.
- Concurrency: Not OK. Existing threads cover split-state access/locking issues and long blocking RPC/internal SQL work while holding splitsLock. These can delay task commits/scheduling or race with TVF updateOffset paths.
- Lifecycle/restart: Not OK. Existing TVF restart comments cover mid-snapshot replay/resume hazards where transient split progress/table cache reconstruction can truncate snapshot progress after restored chunks are consumed.
- Configuration: A new dynamic FE batch-size config is read at split-fetch time, which is appropriate, but existing comments note the batch contract/enforcement is still incomplete in CDC client paths.
- Compatibility/storage format: No new external storage format incompatibility found beyond the new persisted split-progress fields, but replay behavior must be correct across FE restart before this is safe.
- Parallel paths: Not fully OK. MySQL, generic JDBC/Postgres, non-TVF, and cdc_stream TVF paths all need consistent batch limits, startup-mode handling, and resume semantics; existing threads identify gaps.
- Conditional checks/error handling: Some existing concerns remain around splitId parsing, composite split boundaries, SQL literal construction, and error clarity.
- Test coverage/results: Regression and unit tests were added, but coverage does not yet prove the unresolved edge cases from existing threads are fixed; no tests were run by this review.
- Observability: Logging was added for split fetch/progress, sufficient for basic diagnosis, but correctness blockers remain.
- Transaction/persistence/data correctness: Not OK until the existing replay, upsert, and split-progress concerns are resolved; these can affect snapshot completeness/duplicates after restart or retries.
- FE/BE variable passing: The new batchSize field is passed from FE now, but existing comments show downstream enforcement still needs work.
- Performance: Existing comments cover heavy work under splitsLock and repeated full-list sorting/materialization; these are relevant for large split counts.
User focus: No additional user-provided review focus was supplied.
Existing inline review threads should be addressed rather than duplicated here, especially the split-state concurrency/locking, TVF restart recovery, SQL upsert safety, batch-size enforcement, split-key/composite-boundary handling, and split metadata determinism issues.
…plits and sink resolveSplitKeyClass to PG reader
TPC-H: Total hot run time: 29471 ms
TPC-DS: Total hot run time: 170192 ms
…re table id, add TVF mid-snapshot restart case
…splitter is genuinely mid-flight at restart
run buildall
…uneven restart-fe case
run buildall
/review
I cannot approve this PR yet. I did not add duplicate inline comments because the existing review threads already cover the blocking issues I found, but those blockers still affect the PR's correctness.
Critical checkpoint conclusions:
- Goal/test coverage: The PR aims to make CDC snapshot split generation asynchronous and resumable; tests were added for async split, pause/resume, restart, special offset, and TVF paths, but the existing threads identify uncovered failure cases where resume/split progress can still truncate or loop.
- Scope/focus: The implementation is focused on streaming CDC split state, but it changes FE scheduling, durable split metadata, FE-BE request contracts, cdc_client split generation, and regression tests together, which increases integration risk.
- Concurrency: This path is concurrent between scheduler ticks, task dispatch, and commit/updateOffset paths. Some lock coverage was improved, but existing review threads still identify long RPC/meta-table operations and split-state races that need resolution.
- Lifecycle/restart: Restart recovery is central to this change. Existing threads already point out restart cases where transient split progress/table cache and qualified-vs-bare table IDs can prevent fetching the next batch after restored chunks are consumed.
- Config: The new streaming_cdc_fetch_splits_batch_size is mutable and passed to FE-BE fetchSplits; however existing comments note batch size is not consistently enforced by cdc_client split generation.
- Compatibility/protocol: FetchTableSplitsRequest adds cursor and batch fields. Existing comments cover the single-column boundary assumption and splitId format dependency risks.
- Parallel paths: MySQL, generic JDBC/Postgres, and TVF/non-TVF paths are all touched; existing comments identify issues present in both MySQL and generic JDBC paths.
- Special conditions: Empty split batches, malformed split IDs, non-initial startup modes, and table-name normalization are all special cases; existing inline threads already cover unresolved risks in these areas.
- Persistence/transactionality: The streaming_job_meta chunk_list write is part of durable split state. Existing threads identify unsafe SQL construction/id allocation and durable-state divergence risks.
- Data correctness: Because snapshot-to-binlog handoff and restart replay depend on complete split definitions and high-watermarks, the existing persistence/resume issues can cause re-cut, skipped chunks, or snapshot truncation.
- Performance: Existing threads cover full-list re-sort/re-materialization and batch-size overrun risks.
- Observability: Logging was added around split fetching/replay, and no additional observability issue beyond existing comments was found.
- User focus: No additional user-provided review focus was supplied.
Please address the existing inline review threads before this can be considered safe to merge.
/review
Review completed against the current PR diff and existing inline threads. I found a new blocking correctness issue: after a snapshot batch commits, the job can immediately create another task using the just-committed snapshot offset while async split fetching has not produced the next batch yet, causing duplicate snapshot reads. I also found flaky assertions in the new async split regression tests because they stop a background sampler before synchronously verifying the final chunk_list length.
Critical checkpoint conclusions:
- Goal and tests: the async split goal is partially covered by new tests, but the task scheduling handoff between committed snapshot batches and the next async fetch still has a correctness gap; some tests are nondeterministic.
- Scope/focus: the main implementation is focused on async split fetching and restart support, but the scheduler/task interaction needs one more guard.
- Concurrency: split collections are mostly protected by splitsLock in the modified paths; the remaining issue is lifecycle ordering between task success and scheduler-driven advanceSplits, not a raw collection race.
- Lifecycle/restart: restart paths were reviewed with the existing comments in mind; no additional distinct restart issue found beyond the scheduling gap and already-known threads.
- Configuration: the new batch-size config is wired into FE fetchSplits requests; no additional config issue found.
- Compatibility: no new FE/BE protocol compatibility issue found in the current PR file list.
- Parallel paths: MySQL, generic JDBC/Postgres, and TVF paths were checked; the sampler-race test pattern appears in multiple new uneven async split tests.
- Error handling/observability: no additional distinct error-handling issue found beyond existing review threads.
- Persistence/transactions: meta persistence concerns are already covered by existing threads; no additional distinct persistence issue found.
- Performance: no additional distinct performance issue found beyond existing batch-size and lock-hold threads.
User focus: no additional user-provided review focus was supplied.
TPC-H: Total hot run time: 29585 ms
TPC-DS: Total hot run time: 171857 ms
FE UT Coverage Report: Increment line coverage
… stabilize uneven sampler assertions
run buildall
/review
run buildall
I found two additional blocking issues in the async split path.
Critical checkpoint conclusions:
- Goal/tests: the goal is clear (make CDC snapshot splitting scheduler-driven) and partially covered by tests, but TVF restart recovery is still incomplete and can skip remaining snapshot chunks.
- Scope: the change is generally focused, but the new mutable batch-size config lacks validation.
- Concurrency: concurrency around FE split lists is improved from prior revisions, with remaining restart-state correctness risk noted inline.
- Lifecycle/recovery: recovery is involved and still has a TVF cursor restoration bug.
- Config: a new dynamic config is added, but invalid values are not rejected.
- Compatibility: no storage-format incompatibility found beyond the persisted provider fields, which have replay handling for the non-TVF path.
- Parallel paths: the FROM-TO and TVF code paths are not fully aligned because TVF replay does not rebuild cdcSplitProgress.
- Tests: tests exist but do not cover the no-committed-txn/partial-meta TVF restart cursor case or an invalid batch-size config.
- Observability: adequate for normal RPC failures, but not enough to prevent the invalid-config no-progress loop.
- Transactions/persistence: correctness is affected for TVF restart as described; no data write/MoW/visible-version BE concerns apply.
- User focus: no additional user-provided review focus was specified.
@@ -312,9 +300,12 @@ public void updateOffset(Offset offset) {
*/
@Override
public void replayIfNeed(StreamingInsertJob job) throws JobException {
Restoring cachedSyncTables alone is not enough for TVF mid-snapshot recovery. This override restores remainingSplits/finishedSplits from streaming_job_meta and committed offsets, but unlike the base JdbcSourceOffsetProvider.replayIfNeed() it never rebuilds cdcSplitProgress from the last restored mid-table split. After FE restart, cdcSplitProgress.currentSplittingTable stays null; because the single TVF table is already present in remainingSplits/finishedSplits, inherited noMoreSplits() returns true and advanceSplitsIfNeed() stops fetching subsequent batches. Once the restored batch is drained, the job can transition to binlog/snapshot completion and skip the rest of the snapshot. Please mirror the base replay cursor restoration (findResumeMidSplit + applySplitToProgress, or equivalent) for the TVF path after restoring split metadata. This is distinct from the earlier cachedSyncTables-null issue because the table cache is now restored, but the async fetch cursor is still missing.
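A hedged sketch of the cursor restoration the reviewer asks for: after restoring split metadata, rebuild the async fetch cursor from the last restored mid-table split so `noMoreSplits()` does not report completion prematurely. The types and helper below are hypothetical stand-ins, not the PR's actual `findResumeMidSplit`/`applySplitToProgress` code.

```java
import java.util.List;

// Illustrative sketch only: rebuild the async split-fetch cursor on replay.
// Everything here is a simplified stand-in for the real provider state.
public class TvfReplayCursorSketch {
    public static class Progress {
        public String currentSplittingTable; // null means "nothing left to cut"
        public String resumeFromSplitId;     // last restored split of a mid-cut table
    }

    // If the table's snapshot is not fully cut, resume from the last restored
    // split instead of leaving the cursor empty (which would make an inherited
    // noMoreSplits() check return true and stop fetching further batches).
    public static Progress rebuildCursor(String table, List<String> restoredSplitIds,
                                         boolean tableFullyCut) {
        Progress p = new Progress();
        if (!tableFullyCut && !restoredSplitIds.isEmpty()) {
            p.currentSplittingTable = table;
            p.resumeFromSplitId = restoredSplitIds.get(restoredSplitIds.size() - 1);
        }
        return p;
    }
}
```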
public static int streaming_cdc_heavy_rpc_timeout_sec = 600;

@ConfField(mutable = true, masterOnly = true)
public static int streaming_cdc_fetch_splits_batch_size = 100;
This mutable config is used directly as FetchTableSplitsRequest.batchSize, and both cdc_client split loops run only while result.size() < batchSize. If an operator sets this to 0 or a negative value dynamically, cdc_client returns an empty split batch without error; FE then keeps the current table cursor unchanged and will repeatedly see no progress. Please validate/clamp this config to a positive value before sending it (or fail the scheduler tick with a clear error) so an invalid dynamic config cannot wedge async splitting.
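A minimal sketch of the suggested guard, assuming the FE clamps the value before building the fetchSplits request. The config name follows the PR; the default fallback and the clamp itself are illustrative, not the actual fix.

```java
// Hypothetical guard for the mutable batch-size config: never let an invalid
// dynamic value (0 or negative) reach cdc_client, where it would produce
// empty split batches and wedge async splitting with no visible error.
public class BatchSizeGuard {
    static final int DEFAULT_BATCH_SIZE = 100; // matches the config default in the PR

    public static int effectiveBatchSize(int configured) {
        return configured > 0 ? configured : DEFAULT_BATCH_SIZE;
    }
}
```

An alternative, as the comment notes, is to fail the scheduler tick with a clear error instead of silently clamping; either way the invalid value must not translate into a zero-progress loop.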
TPC-H: Total hot run time: 29840 ms
TPC-H: Total hot run time: 29722 ms
TPC-DS: Total hot run time: 169096 ms
TPC-DS: Total hot run time: 172522 ms
TPC-H: Total hot run time: 29750 ms
TPC-DS: Total hot run time: 171441 ms
Summary
`StreamingInsertJob` (CDC FROM-TO and `cdc_stream` TVF paths) used to call `splitChunks()` synchronously inside `CREATE STREAMING JOB`, asking cdc_client to cut every chunk of every table before returning. On large/non-uniform PK tables this can take 30+ minutes, far beyond the BE→cdc_client BRPC 60s timeout, and the SQL client blocks the whole time. This PR makes splitting tick-driven by the FE scheduler:
- `CREATE` returns immediately; no more synchronous `splitChunks()`. Each scheduler tick, `advanceSplits()` issues one short fetchSplits RPC (default `batchSize=100`) and pushes that batch into `remainingSplits`. Tasks dispatch as soon as the first batch lands, so end-to-end first-byte latency stays close to flink-cdc's.
- cdc_client rebuilds the `ChunkSplitter` from the `(currentSplittingTable, nextSplitStart, nextSplitId)` triple supplied by FE; flink-cdc internals are untouched (uses the public `ChunkSplitter` API only).
- Progress lives in `committedSplitProgress` (3-field `SplitProgress`) plus the existing `chunkHighWatermarkMap`/`binlogOffset`. The `streaming_job_meta` system table holds the full `chunk_list` JSON per table (UPSERT on each `advanceSplits`).
- New `SourceOffsetProvider#initSplitProgress`/`noMoreSplits`/`advanceSplits` interface; `StreamingJobSchedulerTask.handlePendingState` pre-advances one batch so the first task doesn't wait a full `max_interval`.

Detailed design lives in the linked plan.
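The tick-driven flow above can be sketched as follows. `remainingSplits` and the per-tick batching mirror the PR summary; the integer cursor is a simplified stand-in for the persisted `SplitProgress`, and the class itself is illustrative.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.List;

// Simplified sketch of scheduler-driven splitting: each tick performs one
// short, bounded fetch and enqueues the batch, so CREATE never blocks on
// chunking. Names other than advanceSplits/remainingSplits are illustrative.
public class TickDrivenSplitter {
    public final Deque<String> remainingSplits = new ArrayDeque<>();
    public int cursor = 0; // stand-in for the persisted resume cursor

    // One scheduler tick: fetch at most batchSize chunks and enqueue them.
    public void advanceSplits(List<String> allChunks, int batchSize) {
        int end = Math.min(cursor + batchSize, allChunks.size());
        remainingSplits.addAll(allChunks.subList(cursor, end));
        cursor = end; // in the real flow this progress is persisted for restart
    }

    public boolean noMoreSplits(List<String> allChunks) {
        return cursor >= allChunks.size();
    }
}
```

Tasks can start draining `remainingSplits` after the first tick, which is why first-byte latency stays low even while later batches are still being cut.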
Changes
- fe-common: `FetchTableSplitsRequest` adds `nextSplitStart` (Object[]) / `nextSplitId` / `batchSize`.
- fe-core: `SourceOffsetProvider` adds 3 default methods: `initSplitProgress`/`advanceSplits`/`noMoreSplits`. `JdbcSourceOffsetProvider` implements the async state machine (committed/`cdcSplitProgress`, `advanceSplits`, dedup, system-table UPSERT, replay path). `JdbcTvfSourceOffsetProvider.initOnCreate` no longer pre-splits; it relies on the same scheduler tick path. `StreamingInsertJob` carries `syncTables` (`@SerializedName("st")`); `initSourceJob`/`initInsertJob` initialize `SplitProgress`; `advanceSplitsIfNeed()` mirrors `fetchMeta` error handling (PAUSE on failure). `StreamingJobSchedulerTask.handlePendingState`/`handleRunningState` call `advanceSplitsIfNeed()` each tick; the PENDING handler pre-advances and short-circuits if PAUSED. `StreamingJobUtils.upsertChunkList` covers id allocation via a `MAX(id)+1` lookup.
- cdc_client: `JdbcIncrementalSourceReader.getSourceSplits()` rebuilt around the public `ChunkSplitter` API (no more in-memory loop / reflection hack).

Tests
- `SplitProgressTest`: copy/null-field semantics.
- `JdbcSourceOffsetProviderAsyncSplitTest`: covers `advanceSplits` (first call / continue same table / cross-table switch / dedup / empty batch), `noMoreSplits`, `updateOffset` committed-progress advancement (mid-chunk vs last chunk vs replay missing-split path), and `computeCdcRemainingTables`.
- `test_streaming_postgres_job_async_split.groovy`: 100 rows × `snapshot_split_size=5` → 20 splits across multiple ticks; asserts CREATE returns < 30s, full snapshot count + DISTINCT id, then INSERT/UPDATE/DELETE in binlog phase.

Test plan
- `mvn test -pl fe/fe-core -Dtest=JdbcSourceOffsetProviderAsyncSplitTest,SplitProgressTest`
- Run the `test_streaming_postgres_job_async_split` regression locally: `CREATE` returns in seconds, `SHOW STREAMING JOB` immediately reflects the new job, snapshot completes, binlog phase healthy.
- `cdc_stream` TVF + StreamingInsertJob path: confirm CREATE no longer blocks.